On the Nondifferential Misclassification of a Binary Confounder
Consider a study with binary exposure, outcome, and confounder, where the confounder is nondifferentially misclassified. Epidemiologists have long accepted the unproven but oft-cited result that, if the confounder is binary, odds ratios, risk ratios, and risk differences which control for the mismeasured confounder will lie between the crude and the true measures. In this paper the authors provide an analytic proof of the result in the absence of a qualitative interaction between treatment and confounder, and demonstrate via counterexample that the result need not hold when there is a qualitative interaction between treatment and confounder. They also present an analytic proof of the result for the effect of treatment among the treated, and describe extensions to measures conditional on or standardized over other covariates.
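The bracketing claim in this abstract can be checked numerically on a single hypothetical population. The sketch below is an illustration only, not the authors' proof or code: the prevalence, exposure probabilities, risk model, and the sensitivity and specificity of 0.8 are all assumed values, and the risk model is chosen to have no treatment-confounder interaction on the risk-difference scale.

```python
import numpy as np

# All numbers below are assumed for illustration; none come from the paper.
p_c = np.array([0.5, 0.5])            # P(C = c) for the true binary confounder
p_a1_given_c = np.array([0.2, 0.8])   # P(A = 1 | C = c)
sens, spec = 0.8, 0.8                 # nondifferential: depends on C only


def risk(a, c):
    """P(Y = 1 | A = a, C = c); additive, so the true risk difference is 0.1."""
    return 0.1 + 0.1 * a + 0.3 * c


# Exact joint distribution joint[c, a, y, s], where s is the misclassified C*.
joint = np.zeros((2, 2, 2, 2))
for c in (0, 1):
    for a in (0, 1):
        p_a = p_a1_given_c[c] if a == 1 else 1 - p_a1_given_c[c]
        for y in (0, 1):
            p_y = risk(a, c) if y == 1 else 1 - risk(a, c)
            for s in (0, 1):
                if c == 1:
                    p_s = sens if s == 1 else 1 - sens
                else:
                    p_s = spec if s == 0 else 1 - spec
                joint[c, a, y, s] = p_c[c] * p_a * p_y * p_s


def standardized_rd(table):
    """Risk difference standardized over strata z, given table[z, a, y]."""
    p_z = table.sum(axis=(1, 2))
    risk_za = table[:, :, 1] / table.sum(axis=2)   # P(Y=1 | Z=z, A=a)
    return float(np.sum(p_z * (risk_za[:, 1] - risk_za[:, 0])))


true_rd = standardized_rd(joint.sum(axis=3))                         # adjust for C
adjusted_rd = standardized_rd(joint.sum(axis=0).transpose(2, 0, 1))  # adjust for C*
crude_rd = standardized_rd(joint.sum(axis=(0, 3))[None, :, :])       # no adjustment

print(true_rd, adjusted_rd, crude_rd)
```

Under these assumed inputs the risk difference adjusted for the mismeasured confounder (about 0.23) falls between the true value (0.10) and the crude value (0.28), which is the ordering the abstract describes for the no-qualitative-interaction case.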
Causal inference for social network data
We describe semiparametric estimation and inference for causal effects using
observational data from a single social network. Our asymptotic result is the
first to allow for dependence of each observation on a growing number of other
units as sample size increases. While previous methods have generally
implicitly focused on one of two possible sources of dependence among social
network observations, we allow for both dependence due to transmission of
information across network ties, and for dependence due to latent similarities
among nodes sharing ties. We describe estimation and inference for new causal
effects that are specifically of interest in social network settings, such as
interventions on network ties and network structure. Using our methods to
reanalyze the Framingham Heart Study data used in one of the most influential
and controversial causal analyses of social network data, we find that after
accounting for network structure there is no evidence for the causal effects
claimed in the original paper.
Identifying effects of multiple treatments in the presence of unmeasured confounding
Identification of treatment effects in the presence of unmeasured confounding
is a persistent problem in the social, biological, and medical sciences. The
problem of unmeasured confounding in settings with multiple treatments is most
common in statistical genetics and bioinformatics settings, where researchers
have developed many successful statistical strategies without engaging deeply
with the causal aspects of the problem. Recently there have been a number of
attempts to bridge the gap between these statistical approaches and causal
inference, but these attempts have either been shown to be flawed or have
relied on fully parametric assumptions. In this paper, we propose two
strategies for identifying and estimating causal effects of multiple treatments
in the presence of unmeasured confounding. The auxiliary variables approach
leverages auxiliary variables that are not causally associated with the
outcome; in the case of a univariate confounder, our method only requires one
auxiliary variable, unlike existing instrumental variable methods that would
require as many instruments as there are treatments. An alternative null
treatments approach relies on the assumption that at least half of the
confounded treatments have no causal effect on the outcome, but does not
require a priori knowledge of which treatments are null. Our identification
strategies do not impose parametric assumptions on the outcome model and do not
rest on estimation of the confounder. This work extends and generalizes
existing work on unmeasured confounding with a single treatment, and provides a
nonparametric extension of models commonly used in bioinformatics.
Augmented balancing weights as linear regression
We provide a novel characterization of augmented balancing weights, also
known as Automatic Debiased Machine Learning (AutoDML). These estimators
combine outcome modeling with balancing weights, which estimate inverse
propensity score weights directly. When the outcome and weighting models are
both linear in some (possibly infinite) basis, we show that the augmented
estimator is equivalent to a single linear model with coefficients that combine
the original outcome model coefficients and OLS; in many settings, the
augmented estimator collapses to OLS alone. We then extend these results to
specific choices of outcome and weighting models. We first show that the
combined estimator that uses (kernel) ridge regression for both outcome and
weighting models is equivalent to a single, undersmoothed (kernel) ridge
regression; this also holds when considering asymptotic rates. When the
weighting model is instead lasso regression, we give closed-form expressions
for special cases and demonstrate a ``double selection'' property. Finally, we
generalize these results to linear estimands via the Riesz representer. Our
framework ``opens the black box'' on these increasingly popular estimators and
provides important insights into estimation choices for augmented balancing
weights.
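One simple way to see how an augmented estimator can collapse to OLS is the special case of exactly balancing, minimum-norm linear weights. The sketch below is an assumed toy setup, not the paper's general construction: the data are simulated, the estimand is the linear functional x_target' beta, the base outcome model is ridge with an arbitrary penalty of 10, and the names `x_target`, `beta_ridge`, and `w` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 5

# Simulated data (assumed for illustration only).
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + rng.normal(size=n)
x_target = rng.normal(size=d)   # target covariate profile for the estimand

gram = X.T @ X

# Base outcome model: ridge regression with an arbitrary penalty.
beta_ridge = np.linalg.solve(gram + 10.0 * np.eye(d), X.T @ y)

# Reference fit: ordinary least squares.
beta_ols = np.linalg.solve(gram, X.T @ y)

# Minimum-norm weights that balance exactly: X.T @ w == x_target.
w = X @ np.linalg.solve(gram, x_target)

# Augmented estimator: outcome-model prediction plus weighted residuals.
augmented = x_target @ beta_ridge + w @ (y - X @ beta_ridge)
ols_estimate = x_target @ beta_ols

print(augmented, ols_estimate)
```

Because the weights balance exactly, the ridge prediction cancels against the weighted ridge fit, leaving w' y = x_target' (X'X)^{-1} X'y, which is the OLS plug-in estimate whatever base outcome coefficients are used. With regularized (inexact) balancing weights the cancellation is only partial, which is the regime the abstract's more general results address.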